InternVL3-78B is a multimodal large language model developed by OpenGVLab with strong overall performance. Compared with its predecessor, InternVL 2.5, it offers stronger multimodal perception and reasoning, and it extends to new domains such as tool use, GUI agents, industrial image analysis, and 3D visual perception.
Multimodal
Transformers
Other